Model Formulation: semCDI: A Query Formulation for Semantic Data Integration in caBIG

Authors

  • E. Patrick Shironoshita
  • Yves R. Jean-Mary
  • Ray M. Bradley
  • Mansur R. Kabuka
Abstract

OBJECTIVES To develop mechanisms to formulate queries over the semantic representation of cancer-related data services available through the cancer Biomedical Informatics Grid (caBIG). DESIGN The semCDI query formulation uses a view of caBIG semantic concepts, metadata, and data as an ontology, and defines a methodology to specify queries using the SPARQL query language, extended with Horn rules. semCDI enables the joining of data that represent different concepts through associations modeled as object properties, and the merging of data representing the same concept in different sources through Common Data Elements (CDEs) modeled as datatype properties, using Horn rules to specify additional semantics indicating conditions for merging data. VALIDATION To validate this formulation, a prototype was constructed, and two queries were executed against currently available caBIG data services. DISCUSSION The semCDI query formulation uses the rich semantic metadata available in caBIG to build queries and integrate data from multiple sources. Its promise will be further enhanced as more data services are registered in caBIG, and as more linkages are achieved between the knowledge contained within caBIG's NCI Thesaurus and the data contained in the Data Services. CONCLUSION semCDI provides a formulation for the creation of queries on the semantic representation of caBIG. This constitutes the foundation for building a semantic data integration system for more efficient and effective querying and exploratory searching of cancer-related data.
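The two integration operations named in the abstract, joining different concepts through an association and merging instances of the same concept through a shared Common Data Element, can be illustrated with a toy example. The sketch below is not the semCDI prototype and does not use SPARQL or any caBIG service. It is a minimal pure-Python approximation in which basic graph patterns are matched against an in-memory set of triples, and one Horn rule derives `owl:sameAs` links between instances that share a CDE value. All service prefixes (`svcA`, `svcB`), property names (`assoc:encodes`, `cde:geneSymbol`, and so on), and data values are hypothetical.

```python
# Toy triples (subject, predicate, object) from two hypothetical data services.
TRIPLES = {
    ("svcA:gene1", "rdf:type", "Gene"),
    ("svcA:gene1", "cde:geneSymbol", "TP53"),
    ("svcA:gene1", "assoc:encodes", "svcA:prot1"),  # association (object property)
    ("svcA:prot1", "rdf:type", "Protein"),
    ("svcA:prot1", "cde:proteinName", "p53"),
    ("svcB:gene9", "rdf:type", "Gene"),
    ("svcB:gene9", "cde:geneSymbol", "TP53"),       # same CDE value as svcA:gene1
    ("svcB:gene9", "cde:chromosome", "17"),
}


def match(pattern, triples, binding):
    """Yield extensions of `binding` that satisfy one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, triples, b)]
    return bindings


def apply_merge_rule(triples, concept, cde):
    """Apply one Horn rule and return the triples plus the derived facts.

    type(x, C) AND type(y, C) AND cde(x, v) AND cde(y, v) -> sameAs(x, y)
    """
    body = [
        ("?x", "rdf:type", concept),
        ("?y", "rdf:type", concept),
        ("?x", cde, "?v"),
        ("?y", cde, "?v"),
    ]
    derived = {
        (b["?x"], "owl:sameAs", b["?y"])
        for b in query(body, triples)
        if b["?x"] != b["?y"]
    }
    return triples | derived


if __name__ == "__main__":
    merged = apply_merge_rule(TRIPLES, "Gene", "cde:geneSymbol")
    rows = query(
        [
            ("?g", "assoc:encodes", "?p"),     # join Gene to Protein
            ("?p", "cde:proteinName", "p53"),
            ("?g", "owl:sameAs", "?g2"),       # merge across services
            ("?g2", "cde:chromosome", "?chr"),
        ],
        merged,
    )
    for row in rows:
        print(row["?g"], row["?g2"], row["?chr"])
```

In this sketch the merge condition lives entirely in the rule body, so a stricter condition (for example, requiring two CDEs to agree) would only mean adding patterns to `body`. A real deployment would express the same patterns in SPARQL and evaluate them against the registered data services rather than a Python set.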


Related articles

Semantic Representation and Querying of caBIG Data Services

A computational grid infrastructure for biomedical research, called caGrid, is under development by the National Cancer Institute (NCI) as part of the cancer Biomedical Informatics Grid (caBIG) Initiative. In this paper we present a model that enables users to query an integrated view of caBIG data services at a conceptual semantic level. The model is based on semCDI, a formulation to generate ...


SPARQL Query Formulation and Execution using FedViz

Health care and life sciences research heavily relies on the ability to search, discover, formulate and correlate data from distinct sources. Although the Semantic Web and Linked Data technologies help in dealing with data integration problem, there remains a barrier adopting these for non-technical research audiences. In this paper we present FedViz, a visual interface for SPARQL query formula...


XML based Mediated Query Re-writing Framework

To integrate the information from heterogeneous data sources and give it a unified representation to the users is known as Information Integration. There are many application architectures that are designed for Enterprise Information Integration for solving the problems of semantic heterogeneity (the modeling problem) and query optimization (the querying problem) in Integration Architecture. Ar...


Using Unity to Semi-Automatically Integrate Relational Schema

Unity is an architecture for integrating relational databases that performs three processes: metadata capture, semantic integration, and query formulation and execution. The foundation of the architecture is a naming methodology that allows concepts to be integrated across systems. Semantic naming of schema constructs increases automation during integration and provides users with physical and ...


Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...



Journal:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 15, Issue 4

Pages: -

Publication date: 2008